Abstract: In today’s world, individuals interact with each other in more complicated patterns than ever. Some individuals engage 
through online social networks (e.g., Facebook, Twitter), while some communicate only through conventional ways (e.g., face-to-face). 
Therefore, understanding the dynamics of information propagation among humans calls for a multi-layer network model where an 
online social network is conjoined with a physical network. In this work, we initiate a study of information diffusion in a clustered 
multi-layer network model, where all constituent layers are random networks with high clustering. We assume that information 
propagates according to the SIR model and with different information transmissibility across the networks. We give results for the 
conditions, probability, and size of information epidemics, i.e., cases where information starts from a single individual and reaches a 
positive fraction of the population. We show that increasing the level of clustering in either one of the layers increases the epidemic 
threshold and decreases the final epidemic size in the whole system. An interesting finding is that information with low transmissibility 
spreads more effectively with a small but densely connected social network, whereas highly transmissible information spreads better 
with the help of a large but loosely connected social network. 

Index Terms: Information Propagation, Clustered Multilayer Networks, Percolation Theory, Random Graphs. 
1 Introduction 

The study of dynamical processes on real-world complex 
networks has been an active research area over the past 
decade. An interesting phenomenon that occurs in many 
such processes is the spreading of an initially localized effect 
throughout the whole (or, a very large part of the) network. 
These events are usually referred to as (information) cascades 
and can be observed in processes as diverse as adoption of 
cultural fads, the diffusion of belief, norms, and innovations 
in social networks, disease contagion in human and
animal populations, failures in interdependent power
systems, the rise of collective action (e.g., joining a riot),
and the global spread of computer viruses or worms on the
Web.

This work focuses on an important class of dynamical
processes known as information propagation or simple
contagions; this is to be contrasted with complex contagions,
often referred to as influence propagation. Although
well studied in the past across various domains, the information
diffusion problem has recently taken a new form
and dimension with the emergence of online social networks
such as Facebook and Twitter. In particular, due to the existence
of multiple online social networks, information is now
likely to spread among the population at an unprecedented
speed and scale. Although there has been a recent surge of
research on multi-layer and multiplex networks, the current
literature still falls short of fully quantifying this phenomenon.
For instance, Yagan et al. analyzed the diffusion of information
in a multi-layer network, but only for the cases where all
constituent layers are generated by the configuration model [17];
other works follow in the same vein. However, the configuration
model produces networks [20] that cannot accurately capture some
important aspects of real-world social networks, most notably the
property of high clustering. Informally known as the phenomenon
that "friends of our friends" are likely to be our friends,
clustering has been shown to significantly impact the dynamics
of various diffusion processes [25].

With these in mind, we study information propagation in
clustered multi-layer networks. In particular, we consider a
model where all constituent layers are random networks with
clustering as introduced by Miller and Newman, i.e.,
they are generated randomly from given distributions specifying
the number of single edges and triangles for any given
node; see Section 2.1 for details. Our modeling framework
consists of a physical network where information spreads
amongst people through conventional communication media
(e.g., face-to-face communication, phone calls), and, overlaying
this network, online social networks offering alternative
platforms for information diffusion, such as Facebook, Twitter,
and Google+. The coupling across these networks results from
the nodes they have in common, i.e., individuals who participate
in multiple networks simultaneously; see Section 2.2 for details
of our multi-layer network model, where the coupling level is tunable.

In this setting, we analyze the propagation of information
assuming that it spreads according to
the SIR epidemic model.¹ Namely, an individual is either
susceptible (S), meaning that she has not yet received a
particular piece of information; infectious (I), meaning that she
is aware of the information and is spreading it to her
contacts; or recovered (R), meaning that she is no longer
spreading the information. Let $T_{ij}$ denote the probability
that an infectious individual $i$ transmits the information
to a susceptible contact $j$. Throughout, we account for the
fact that individuals' information spreading behaviors may
differ from one network to another; e.g., one may be more
active on Facebook than on Twitter, or vice versa. The varying
rate of information diffusion across different social networks
is captured in our formulation by having the transmissibility
$T_{ij}$ depend on the network that the link between $i$ and $j$
belongs to; see Section 2.3 for details.

1. The analogy between the spread of diseases and information has
long been recognized [26], and the SIR epidemic model is commonly
used in similar studies; e.g., see [27], [28].


Our main contributions are as follows. We solve analytically
for the threshold, probability, and mean size of
information epidemics, i.e., cases where information starts
from a single individual and reaches a positive fraction of
the population; see Section 2.4 for precise definitions. Our
analytical approach is based on mapping the SIR propagation
model to a bond percolation process and then utilizing a
multi-type branching process to solve for the quantities of
interest; the isomorphism between the SIR model and bond
percolation has been established for certain cases in earlier
work. The analytical results are validated and extended by
computer simulations.


Several interesting conclusions are drawn from these 
results. For example, we show that increasing the level of 
clustering in any one of the layers increases the epidemic 
threshold and decreases the final epidemic size of the 
whole system. Put differently, we show that i) clustering
makes it more difficult for a single person to spread the
information to the masses; and ii) even if the information
reaches the masses, clustering decreases the total fraction
of individuals informed. We also demonstrate how the overlap
between the constituent networks affects the information
propagation dynamics, particularly through its impact on
the degree-degree correlations. For instance, we
show that an online social network that is small in size 
but large in mean connectivity is more effective (resp. less 
effective) in facilitating the propagation of information with 
low transmissibility (resp. high transmissibility) as compared to 
a large social network with smaller mean connectivity, with 
the total number of edges fixed in both cases. 


Our general framework contains non-clustered multi-layer
networks and single-layer clustered networks as special
cases. In addition, given that the information propagation
problem is studied via bond percolation over a multi-layer
network, our work can also be useful in the context
of robustness against random attacks. Finally, although the
problem is motivated here in the context of information
propagation, network coupling is relevant in many simple
contagion processes, including the diffusion of diseases;
e.g., a small community may consist of three coupled networks
corresponding to three venues where people can interact:
households, hospitals, and schools.

The rest of the paper is organized as follows. In Section 2,
we introduce the models applied in this study and the
problems to be considered. In Section 3, we introduce the
related technical background. In Section 4, we present and
derive the main results of this work, which we then
confirm via computer simulations before concluding the paper.


2 Problem Formulation 

In this section, we give precise definitions of our system 
model and then describe problems that shall be studied. 


2.1 Random Graphs with Clustering 

Our modeling framework is based on random networks
with clustering as introduced independently by Miller
and Newman. This model takes its roots from the
widely used configuration model [17], which generates a network
randomly according to a given degree distribution. Namely,
consider a vertex set $V = \{1, 2, \ldots, n\}$, where each vertex is
independently assigned a random number of stubs according
to a probability distribution $\{p_k\}_{k=0}^{\infty}$; i.e., the degree $d_i$
of vertex $i$ equals $k$ with probability $p_k$ for any
$k = 0, 1, \ldots$. Then, stubs are randomly paired with each other
to form edges until no free stub is left; see Figure 1 for an
illustration of the configuration model.
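
The stub-pairing step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the generator used in this work: the function name `configuration_model` and the fixed degree sequence passed to it are hypothetical, and self-loops or repeated edges produced by the random pairing are simply left in place.

```python
import random


def configuration_model(degrees, seed=None):
    """Pair stubs uniformly at random (illustrative sketch).

    degrees[v] is the number of stubs assigned to vertex v. The returned
    edge list may contain self-loops and repeated edges.
    """
    rng = random.Random(seed)
    # One entry per stub, labeled by the vertex that owns it.
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    if len(stubs) % 2 == 1:
        raise ValueError("the sum of degrees must be even")
    rng.shuffle(stubs)
    # Consecutive stubs in the shuffled list are joined into an edge.
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs), 2)]


example_edges = configuration_model([2, 1, 1, 0, 3, 1], seed=1)
```

In a full simulation the degree sequence would itself be drawn from the distribution $\{p_k\}$; it is fixed here only to keep the sketch short.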




Fig. 1. Illustration of the configuration model process.

It is known [17], [34] that the configuration model generates
tree-like graphs, with the number of cycles approaching zero
as the number of nodes gets large. However, most social
networks exhibit high clustering, anecdotally known as the
likelihood of a "friend of a friend" being one's friend. Put
differently, real-world social networks are not tree-like and
instead have a considerable number of cycles, particularly of
size three, i.e., triangles. With this in mind, Miller and
Newman proposed a modification of the configuration
model that enables generating random graphs with given
degree distributions and tunable clustering.

The model proposed by Miller and Newman is often referred to
as random networks with clustering and is based on the
following algorithm. Consider a joint degree distribution
$\{p_{st}\}_{s,t=0}^{\infty}$ that gives the probability that a node has $s$ single
edges and $t$ triangles; e.g., see node 2 in Figure 2(a), which has
two single edges and one triangle. Namely, each node will
be given $s$ stubs labeled as single and $2t$ stubs labeled as
triangle with probability $p_{st}$, for any $s, t = 0, 1, \ldots$. Then,
stubs that are labeled as single are randomly joined to form
single edges that are not part of a triangle, whereas pairs
of triangle stubs from three nodes are randomly matched
to form triangles between the three participating nodes;
the total degree of a node is then distributed according to
$p_k = \sum_{s,t:\, s+2t=k} p_{st}$. As in the standard configuration
model, it can be shown that the number of cycles formed by
single edges goes to zero as $n$ gets large, and so does the
number of cycles of length larger than three.
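
The algorithm above can be illustrated with a short Python sketch. It assumes fixed per-node counts of single edges and triangles rather than draws from $p_{st}$, and it represents the $2t$ triangle stubs of a node as $t$ triangle "corners", which is equivalent because each triangle uses two stubs at each of its three nodes. The function name `clustered_network` is hypothetical, and degenerate outcomes (a node matched with itself) are not filtered out.

```python
import random


def clustered_network(single_counts, triangle_counts, seed=None):
    """Match single stubs in pairs and triangle corners in triples (sketch).

    single_counts[v]   : number of single edges s of node v
    triangle_counts[v] : number of triangles t of node v
    Returns (single_edges, triangles).
    """
    rng = random.Random(seed)
    single_stubs = [v for v, s in enumerate(single_counts) for _ in range(s)]
    corners = [v for v, t in enumerate(triangle_counts) for _ in range(t)]
    if len(single_stubs) % 2 != 0:
        raise ValueError("the number of single stubs must be even")
    if len(corners) % 3 != 0:
        raise ValueError("the number of triangle corners must be a multiple of 3")
    rng.shuffle(single_stubs)
    rng.shuffle(corners)
    single_edges = [(single_stubs[i], single_stubs[i + 1])
                    for i in range(0, len(single_stubs), 2)]
    triangles = [tuple(corners[i:i + 3]) for i in range(0, len(corners), 3)]
    return single_edges, triangles


example_singles, example_triangles = clustered_network(
    [2, 0, 1, 1, 0, 0], [1, 2, 1, 0, 1, 1], seed=7)
```

Each returned triangle contributes three edges to the network, so a node with counts $(s, t)$ ends up with total degree $s + 2t$, in line with the expression for $p_k$ above.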

Fig. 2. Illustration of a random network with clustering. In part (a), node
2 has two single edges and one triangle, whereas in part (b) it has zero
single edges and two triangles. For the network in Figure 2(a), the global
clustering coefficient is 0.2 while the local clustering coefficient is 0.3,
while for the network in Figure 2(b) these coefficients are given by 0.4
and 0.7, respectively.

The resulting level of clustering of the model described
above can be quantified in a number of ways. Here we consider
two widely used metrics known as the global clustering
coefficient and the local clustering coefficient; see [36]-[38]
for other definitions of clustering coefficient proposed
in the literature. Namely, the global clustering coefficient is defined via

$$C_{\text{global}} = \frac{3 \times (\text{number of triangles in network})}{\text{number of connected triples}} \tag{1}$$

where "connected triple" means a single vertex connected
by edges to two others. On the other hand, the local clustering
coefficient is defined as the average

$$C_{\text{local}} = \frac{1}{n^*} \sum_{i} C_i \tag{2}$$

where $C_i$ denotes the clustering coefficient for node $i$ given
by

$$C_i = \frac{\text{number of triangles connected to vertex } i}{\text{number of connected triples centered on vertex } i}. \tag{3}$$

Here, $n^*$ is the number of nodes whose $C_i$ is well-defined in
the network; i.e., the number of nodes for which the denominator
in (3) is nonzero. The difference between the two definitions
of clustering is illustrated in Figures 2(a) and 2(b), where
networks with the same degree distribution are considered.

It was shown that both $C_{\text{global}}$ and $C_{\text{local}}$ are
positive in the random clustered network model, while both
quantities approach zero with increasing network size in
the standard configuration model.
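
To make the difference between definitions (1) and (2)-(3) concrete, the following Python sketch computes both coefficients for a small simple graph. The adjacency-dictionary input format and the function name `clustering_coefficients` are assumptions made for this illustration.

```python
from itertools import combinations


def clustering_coefficients(adj):
    """Return (C_global, C_local) for a simple undirected graph.

    adj maps each node to the set of its neighbours.
    """
    closed_total = 0    # summed over vertices; equals 3 * (number of triangles)
    triples_total = 0   # connected triples, counted at their center vertex
    local_values = []   # C_i for nodes where it is well-defined
    for v, nbrs in adj.items():
        k = len(nbrs)
        triples_v = k * (k - 1) // 2
        # Pairs of neighbours of v that are themselves adjacent.
        closed_v = sum(1 for a, b in combinations(sorted(nbrs), 2) if b in adj[a])
        closed_total += closed_v
        triples_total += triples_v
        if triples_v > 0:
            local_values.append(closed_v / triples_v)
    c_global = closed_total / triples_total if triples_total else 0.0
    c_local = sum(local_values) / len(local_values) if local_values else 0.0
    return c_global, c_local


# A triangle {0, 1, 2} with one pendant node 3 attached to node 0.
toy_graph = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
toy_global, toy_local = clustering_coefficients(toy_graph)
```

For this toy graph there is one triangle and five connected triples, so (1) gives 3/5, whereas averaging $C_i$ over the three nodes with a nonzero denominator gives (1/3 + 1 + 1)/3 = 7/9; the two metrics can therefore differ noticeably even on a small network.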


2.2 Multilayer Network Models with Clustering 

In this paper, we consider a multilayer network where
each layer is generated independently and constitutes a
random graph with clustering as introduced in Section 2.1.
For brevity, we only consider two layers, but most of
the arguments can easily be extended to a higher number of
layers. Namely, we let $\mathbb{W}$ and $\mathbb{F}$ denote the two constituent
layers, with the possible motivation that $\mathbb{W}$
models the physical contact network among individuals, i.e.,
face-to-face relationships, while network $\mathbb{F}$ stands
for an online social network, say Facebook. In line with this
terminology, we assume that the network $\mathbb{W}$ is defined on
the vertices $\mathcal{N} = \{1, \ldots, n\}$, while $\mathbb{F}$ contains only a subset
of the nodes in $\mathcal{N}$ to account for the fact that not every
individual participates in online social networks; see Figure 3
for an illustration of the two-layer network model we are
considering.


To specify this model further, we assume that each vertex
in $\mathcal{N}$ participates in $\mathbb{F}$ independently with probability
$\alpha \in (0, 1]$, leading by the Strong Law of Large Numbers to

$$\frac{|\mathcal{N}_F|}{n} \xrightarrow{\text{a.s.}} \alpha \tag{4}$$

where $\mathcal{N}_F$ denotes the set of vertices in network $\mathbb{F}$; here
$\xrightarrow{\text{a.s.}}$ denotes convergence in the almost sure sense with $n$ growing
unboundedly large. In words, this implies that the fraction
of nodes that belong to $\mathbb{F}$ is $\alpha$ in the large $n$ limit. The case
where $|\mathcal{N}_F| = o(n)$ has been considered in earlier work, where it was
shown that most properties pertaining to the propagation
of information are unaffected by the existence of the upper
layer $\mathbb{F}$; i.e., when the online social network has a negligible
size compared to the whole population, it does not impact
the threshold or size of information epidemics.

As mentioned already, we assume that both $\mathbb{F}$ and $\mathbb{W}$
are random networks with clustering. In particular, we let
$\{p^f_{st},\ s,t = 0,1,\ldots\}$ and $\{p^w_{st},\ s,t = 0,1,\ldots\}$ denote the
joint distributions for single edges and triangles for $\mathbb{F}$ and
$\mathbb{W}$, respectively. Then both networks are generated independently
according to the algorithm described in Section 2.1,
and they are denoted respectively by $\mathbb{F} = \mathbb{F}(n; \alpha, p^f_{st})$
and $\mathbb{W} = \mathbb{W}(n; p^w_{st})$. We define the multi-layer network
$\mathbb{H}$ as the disjoint union $\mathbb{H} = \mathbb{F} \coprod \mathbb{W}$ and represent it by
$\mathbb{H}(n; \alpha, p^f_{st}, p^w_{st})$. Here, the disjoint union operation implies
that we still distinguish $\mathbb{F}$-edges from $\mathbb{W}$-edges in network
$\mathbb{H}$, and this is done to accommodate the possibly different
rates (or even rules) of information propagation across the
two networks. To this end, an equivalent representation of
$\mathbb{H}$ would be a multiplex network with different types (or
colors) of edges.

With these definitions in mind, let $d_{fs}$ and $d_{ws}$ denote
the random variables corresponding to the number of single
edges for a vertex in $\mathbb{F}$ and $\mathbb{W}$, respectively, while $n_{ft}$ and $n_{wt}$
are defined similarly for the number of triangles assigned;
i.e., the degree of a node from triangle edges in $\mathbb{F}$ is given by
$d_{ft} = 2n_{ft}$, and similarly $d_{wt} = 2n_{wt}$ in $\mathbb{W}$. Then the colored
degree $\boldsymbol{d}$ of a vertex is given by

$$\boldsymbol{d} = (d_{fs}, 2n_{ft}, d_{ws}, 2n_{wt}) \tag{5}$$

meaning that the vertex has $d_{fs}$ single edges and $2n_{ft}$ triangle
edges in network $\mathbb{F}$, and $d_{ws}$ single edges and $2n_{wt}$ triangle
edges in network $\mathbb{W}$. Under the assumptions enforced here,
the distribution of this colored degree is given by

$$p_{\boldsymbol{d}} = \left(\alpha\, p^f_{d_{fs} n_{ft}} + (1-\alpha)\, \mathbf{1}[d_{fs} = 0 \wedge n_{ft} = 0]\right) \cdot p^w_{d_{ws} n_{wt}} \tag{6}$$

where the term $(1-\alpha)\,\mathbf{1}[d_{fs} = 0 \wedge n_{ft} = 0]$ accounts
for the fact that if the node does not belong to $\mathbb{F}$ (which
happens with probability $1-\alpha$), then its degree from single
and triangle edges in $\mathbb{F}$ will both be zero; the factor
$p^w_{d_{ws} n_{wt}}$ reflects that $\mathbb{W}$ is generated independently of $\mathbb{F}$.

2.3 Information Propagation Model: SIR 

Consider the diffusion of a piece of information in the
multi-layer network $\mathbb{H}$ which starts from a single node.
We assume that information spreads from a node to its
neighbors according to the SIR epidemic model. In this
context, an individual is either susceptible (S), meaning that
she has not yet received a particular item of information;
infectious (I), meaning that she is aware of the information
and is capable of spreading it to her contacts; or recovered
(R), meaning that she is no longer spreading the information
[26]-[28]. As in earlier work, we assume that an infectious individual
$i$ transmits the information to a susceptible contact $j$ with
probability $T_{ij} = 1 - e^{-r_{ij}\tau_i}$. Here, $r_{ij}$ denotes the rate of
contact over the link from $i$ to $j$, and $\tau_i$ is the time $i$ keeps
spreading the information; i.e., the time $i$ remains infectious.

Fig. 3. Illustration of the two-layer network model, with the social network (F) drawn above the physical network (W). Green nodes in the
upper circle belong to F, while blue nodes in the lower circle belong to W. A red line connecting two nodes across the two networks
indicates that they represent the same individual.

It is expected that information propagates over the physical
and social networks at different rates, which manifests
itself through different transmission probabilities $T_{ij}$ across links.
Specifically, let $T^w_{ij}$ stand for the probability of information
transmission over a link (between $i$ and $j$) in $\mathbb{W}$, and
let $T^f_{ij}$ denote the probability of information transmission
over a link in $\mathbb{F}$. For simplicity, we assume that $T^w_{ij}$ and
$T^f_{ij}$ are independent for all distinct pairs $i, j = 1, \ldots, n$.
Furthermore, we assume that the random variables $r^w_{ij}$ and
$\tau^w_i$ are independent and identically distributed (i.i.d.) with
probability densities $P_w(r)$ and $P_w(\tau)$, respectively. We find
it useful to define $T_w$ as the mean of $T^w_{ij}$; i.e.,

$$T_w := \langle T^w_{ij} \rangle = 1 - \int_0^{\infty}\!\!\int_0^{\infty} e^{-r\tau} P_w(r) P_w(\tau)\, dr\, d\tau.$$

We refer to $T_w$ as the transmissibility over $\mathbb{W}$ and note that
$0 < T_w < 1$. In the same manner, we assume that $r^f_{ij}$ and $\tau^f_i$
are i.i.d. with respective densities $P_f(r)$ and $P_f(\tau)$, leading
to a transmissibility $T_f$ over $\mathbb{F}$.
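
As a numerical illustration of the double integral defining the transmissibility, the sketch below evaluates it with a plain midpoint rule. The choice of unit-rate exponential densities for both the contact rate and the infectious period is an assumption made purely for this example, as are the function names and the truncation of the integration range; for that particular choice the integral has the closed form $1 - e\,E_1(1) \approx 0.4037$, which gives a reference value to compare against.

```python
import math


def transmissibility(rate_pdf, tau_pdf, r_max, tau_max, steps=400):
    """Midpoint-rule estimate of T = 1 - double integral of exp(-r*tau) P(r) P(tau).

    The infinite integration range is truncated to [0, r_max] x [0, tau_max],
    so both densities should carry negligible mass beyond those limits.
    """
    dr = r_max / steps
    dtau = tau_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        weight_r = rate_pdf(r)
        for j in range(steps):
            tau = (j + 0.5) * dtau
            total += math.exp(-r * tau) * weight_r * tau_pdf(tau)
    return 1.0 - total * dr * dtau


def unit_exponential_pdf(x):
    """Density of an exponential random variable with rate 1 (assumed here)."""
    return math.exp(-x)


t_example = transmissibility(unit_exponential_pdf, unit_exponential_pdf, 40.0, 40.0)
```

The same routine can be called once per layer with that layer's densities to obtain $T_w$ and $T_f$ separately.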

As shall be discussed in Section 2.4, under certain conditions
it can be assumed that information propagates over
$\mathbb{W}$ (resp. over $\mathbb{F}$) as if all transmission probabilities were
equal to $T_w$ (resp. to $T_f$), for the purposes of computing the
threshold, probability, and expected size of epidemics.


2.4 Problems of Interest 

We consider the propagation of information (or a disease)
in $\mathbb{H}$ as explained in Section 2.3. The outbreak is triggered
by infecting a randomly selected node and propagates in
the network according to the SIR model. Given the monotonicity
of the SIR process, a steady state will always
be reached where all nodes are either recovered or susceptible.
The final size of an outbreak is defined as the number of
nodes that are recovered at the steady state, and its relative
final size is its final size divided by the total size $n$ of the
network. Following earlier work, we define a self-limited outbreak as
an outbreak whose relative final size approaches zero, and
an epidemic as an outbreak whose relative final size is positive,
both in the limit of large $n$. There is a critical boundary
in the space of all network parameters, often referred to as the
epidemic threshold or epidemic boundary, that separates the
cases for which the probability of an epidemic is zero (i.e., the
sub-critical, or non-epidemic, parameter regime) from those
that lead to $P[\text{epidemic}] > 0$ (i.e., the super-critical, or epidemic,
regime), again with $n \to \infty$.

With these definitions in place, this work seeks to iden¬ 
tify i) the epidemic boundary; ii) the relative final size 
of epidemics in the super-critical case; and iii) the exact 
probability P[epidemic] in the super-critical regime. 

As we seek to study several properties of simple contagions
as outlined above, a first step will be to observe
that under certain conditions, the SIR propagation model is
isomorphic to a bond percolation process [40]. More specifically,
assume that each edge in $\mathbb{W}$ (resp. $\mathbb{F}$) is occupied, meaning
that it can be used in spreading the information, disease,
etc., with probability $T_w$ (resp. $T_f$) independently from
all other edges. Here, $T_w$ and $T_f$ are transmissibility parameters
calculated as the mean probability of transmission
between any two nodes in the corresponding networks; see
Section 2.3. Then, the size of an outbreak started from an
arbitrary node is equal to the number of individuals that can
be reached from the initial node by using only the occupied
links in $\mathbb{H}$.

This isomorphism was first claimed to hold by Newman,
who studied the SIR model in single networks. It was
later shown by several authors that the SIR process
is isomorphic to bond percolation only when the infectious
period distribution $P(\tau)$ is degenerate; i.e., when all nodes
have the same recovery time $\tau_1 = \cdots = \tau_n$. When nodes
have heterogeneous recovery times, the SIR process is not
isomorphic to a bond percolation process. However, it has been
proved that, in the large network size limit, a bond
percolation process can still be used to accurately predict
the a) epidemic boundary, b) mean size of self-limited outbreaks,
and c) relative final size of epidemics. With respect
to our goals, it is only the probability $P[\text{epidemic}]$ that cannot
be obtained through analyzing the bond percolation model
when the recovery times are heterogeneous; in fact, we are
not aware of any technique in the literature that enables
calculating $P[\text{epidemic}]$ exactly in these cases. Therefore, we
restrict our attention to cases where the recovery times are
uniform when dealing with $P[\text{epidemic}]$, while more general
cases are considered for the boundary and final size of
epidemics. To that end, our efforts towards analyzing information
propagation (e.g., items (i)-(iii) given above) rely
on mapping the SIR model to a bond percolation process.

We now explain how mapping the problem to a bond
percolation process paves the way to obtaining the quantities
(i)-(iii) given above. Let $\mathbb{W}^{o}$ (resp. $\mathbb{F}^{o}$), with the superscript
marking occupied edges, be the network
that contains only the occupied edges of $\mathbb{W}$ (resp. $\mathbb{F}$). Put
differently, consider an Erdős-Rényi network $G(n; T_w)$
(resp. $G(\mathcal{N}_F; T_f)$) on the nodes $\{1, \ldots, n\}$ (resp. on the
node set $\mathcal{N}_F$) such that between every pair of nodes there
is an edge with probability $T_w$ (resp. $T_f$) independently
from all other edges. Then, $\mathbb{W}^{o} = \mathbb{W} \cap G(n; T_w)$ and
$\mathbb{F}^{o} = \mathbb{F} \cap G(\mathcal{N}_F; T_f)$. The bond percolation network $\mathbb{H}^{o}$
that contains only the occupied edges of $\mathbb{H}$ is then given
by $\mathbb{H}^{o} = \mathbb{W}^{o} \cup \mathbb{F}^{o}$. The different transmissibility properties of
$\mathbb{W}$ and $\mathbb{F}$ are already incorporated into this model through
the distinct bond occupation probabilities $T_w$ and $T_f$. Thus, $\mathbb{H}^{o}$
(defined on the vertices $\{1, \ldots, n\}$) is a simplex network
obtained by a simple union of the edges of $\mathbb{W}^{o}$ and $\mathbb{F}^{o}$.

The threshold and relative final size of epidemics can
now be computed from the phase transition behavior of $\mathbb{H}^{o}$.
Namely, epidemics can take place if and only if $\mathbb{H}^{o}$ has a
giant component; i.e., a connected subgraph that contains a
positive fraction of nodes in the large $n$ limit. Thus, the epidemic
boundary is given by the phase transition threshold, i.e.,
the threshold for the existence of a giant component in $\mathbb{H}^{o}$.
Also, a node can trigger an epidemic only if it belongs to the
giant component, in which case an outbreak started from
this node will reach the whole giant component. Hence,
the relative size of the giant component in $\mathbb{H}^{o}$ gives both
$P[\text{epidemic}]$ and the relative final size of epidemics.
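
The percolation mapping lends itself to a simple simulation sketch: occupy each edge independently with the transmissibility of its layer, then measure the largest connected component of what remains. The Python code below does this with a union-find structure; the function name, the edge-list inputs, and the use of the largest component as a finite-size proxy for the giant component are assumptions of this illustration rather than the simulation procedure of this work.

```python
import random


def largest_component_fraction(n, w_edges, f_edges, t_w, t_f, seed=None):
    """Fraction of nodes in the largest component after bond percolation.

    Each W-edge is kept with probability t_w and each F-edge with
    probability t_f, independently of all other edges.
    """
    rng = random.Random(seed)
    parent = list(range(n))
    size = [1] * n

    def find(v):
        # Path halving keeps the trees shallow.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra
        size[ra] += size[rb]

    for edges, t in ((w_edges, t_w), (f_edges, t_f)):
        for a, b in edges:
            if rng.random() < t:  # the edge is occupied
                union(a, b)
    return max(size[find(v)] for v in range(n)) / n


toy_fraction = largest_component_fraction(
    5, [(0, 1), (1, 2)], [(2, 3)], t_w=1.0, t_f=1.0)
```

On networks large enough for the asymptotic theory to be relevant, averaging this fraction over many independent realizations gives an empirical estimate of the relative size of the giant component.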


3 Technical Background 

In what follows, we introduce the technical underpinnings
of our analysis. Our approach is based on exploring a
branching process which starts with an arbitrary node in the
network and recursively reveals all the nodes reached and
informed by following its edges; see Figure 4. Throughout,
we will be interested in various discrete random variables
naturally associated with this branching process; e.g., the total
number of nodes reached and informed by following a
randomly selected edge in $\mathbb{W}$ (resp. in $\mathbb{F}$). Oftentimes we
find it useful to characterize the probability distributions
of these random variables through their generating functions.
This approach has been widely adopted in the literature
on complex networks and has several benefits,
as shall soon become apparent.

We now formally define the notion of a generating function:
Let $X$ be a non-negative, integer-valued random variable
with the distribution $\{p_k : k = 0, 1, \ldots\}$; i.e., we have
$P(X = k) = p_k$. Then the generating function of $X$ is given
by

$$h(x) = \sum_{k=0}^{\infty} p_k x^k, \qquad x \in \mathbb{R}. \tag{7}$$

We remark that the distribution of a random variable is uniquely
identified by its generating function since we have

$$p_k = \frac{h^{(k)}(0)}{k!}, \qquad k = 0, 1, \ldots$$

where $h^{(k)}(x)$ denotes the $k$th order derivative of $h(x)$.
Also, we can easily compute the moments of $X$ from the
derivatives of $h(x)$ evaluated at the point $x = 1$; e.g.,
the first moment is given through $E[X] = h'(1)$, i.e., by the
first derivative of $h(x)$ evaluated at $x = 1$.
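
The properties just listed are easy to check numerically for a distribution with finite support. The Python sketch below builds $h(x)$ from a probability mass function, evaluates its first derivative, and also illustrates the powers property (the generating function of a sum of independent copies is a power of $h$) via an explicit convolution. The helper names and the example distribution are assumptions made for illustration.

```python
def pgf(pmf):
    """Return h(x) = sum_k p_k x^k for a finite pmf given as a list [p_0, p_1, ...]."""
    return lambda x: sum(p * x ** k for k, p in enumerate(pmf))


def pgf_derivative(pmf, x=1.0):
    """First derivative h'(x); at x = 1 this equals the mean E[X]."""
    return sum(k * p * x ** (k - 1) for k, p in enumerate(pmf) if k > 0)


def convolve(pmf_a, pmf_b):
    """pmf of the sum of two independent variables with the given pmfs."""
    out = [0.0] * (len(pmf_a) + len(pmf_b) - 1)
    for i, a in enumerate(pmf_a):
        for j, b in enumerate(pmf_b):
            out[i + j] += a * b
    return out


example_pmf = [0.2, 0.5, 0.3]          # P(X=0), P(X=1), P(X=2)
h = pgf(example_pmf)
mean_x = pgf_derivative(example_pmf)   # 0.5 * 1 + 0.3 * 2
sum_of_two = convolve(example_pmf, example_pmf)
```

Evaluating `pgf(sum_of_two)` at any point agrees with `h` squared at that point, which is the property used below when combining the contributions of several edges and triangles of the same node.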



Fig. 4. Illustration of the branching process. The children of each individual
node are identified recursively, while taking into account whether
or not the information is transferred from the parent node to the child
node. The initial vertex that starts the information is regarded as the
0th generation, and we are interested in deriving the limiting behavior of
the total number of nodes reached and informed as the number of nodes
$n \to \infty$.


4 Main Results 


As described in Section 2.2, the clustered multilayer network
in this paper consists of four kinds of edges: single edges in $\mathbb{F}$,
triangle edges in $\mathbb{F}$, single edges in $\mathbb{W}$, and triangle edges in $\mathbb{W}$;
these will be denoted by fs-, ft-, ws-, and wt-edges,
respectively. In order to analyze information propagation
in multilayer networks, we consider a branching
process that starts with informing a node selected randomly
from among all nodes $\{1, \ldots, n\}$. We then explore all the
neighbors that are reached and informed by this node, and
continue recursively until the branching process stops. The
distribution of the resulting number of nodes informed will
be characterized via its generating function.

We now explain our approach based on generating functions
precisely. Let $H(x)$ denote the generating function for
the "finite number of nodes that are reached and informed"
by the above branching process. We will derive an expression
for $H(x)$ using four other generating functions $h_{fs}(x)$,
$h_{ft}(x)$, $h_{ws}(x)$, and $h_{wt}(x)$, where $h_{fs}(x)$ stands for the
"finite number of nodes reached and informed by following
a randomly selected fs-edge," and $h_{ws}(x)$ is defined similarly
for the ws-edges. The definitions for $h_{ft}(x)$ and $h_{wt}(x)$
are a bit different in the sense that they correspond to the
"finite number of nodes reached and informed by following
a randomly selected triangle in $\mathbb{F}$ (resp. in $\mathbb{W}$)" for $h_{ft}(x)$
(resp. $h_{wt}(x)$). In other words, we consider the whole triangle
at once, rather than focusing on its edges separately;
see Section 4.2.
With these definitions in place, we now write $H(x)$ in
terms of $h_{fs}(x)$, $h_{ft}(x)$, $h_{ws}(x)$, and $h_{wt}(x)$:

$$H(x) = x \sum_{\boldsymbol{d}} p_{\boldsymbol{d}}\, h_{fs}(x)^{d_{fs}}\, h_{ft}(x)^{n_{ft}}\, h_{ws}(x)^{d_{ws}}\, h_{wt}(x)^{n_{wt}} \tag{8}$$

where $p_{\boldsymbol{d}}$ denotes the colored degree distribution given by
(6). The validity of (8) can be seen as follows. The term $x$
stands for the node that is selected randomly and given
the information to initiate the propagation. This node has
a degree $\boldsymbol{d} = (d_{fs}, 2n_{ft}, d_{ws}, 2n_{wt})$ with probability $p_{\boldsymbol{d}}$. The
number of nodes reached and informed by each of its $d_{fs}$
(resp. $d_{ws}$) single edges in $\mathbb{F}$ (resp. $\mathbb{W}$) has a generating
function $h_{fs}(x)$ (resp. $h_{ws}(x)$). Similarly, the number of
nodes informed by following each of the $n_{ft}$ (resp. $n_{wt}$)
triangles it participates in within $\mathbb{F}$ (resp. $\mathbb{W}$) has a generating
function $h_{ft}(x)$ (resp. $h_{wt}(x)$). Combining, we see from
the powers property of generating functions [17] that the
number of nodes reached and informed in this process when
the initial node has degree $\boldsymbol{d}$ has a generating function
$h_{fs}(x)^{d_{fs}}\, h_{ft}(x)^{n_{ft}}\, h_{ws}(x)^{d_{ws}}\, h_{wt}(x)^{n_{wt}}$. Averaging over all
possible degrees $\boldsymbol{d}$ of the initial node, we get (8).

For (8) to be useful, we shall derive expressions for the
generating functions $h_{fs}(x)$, $h_{ft}(x)$, $h_{ws}(x)$, and $h_{wt}(x)$.
As will become apparent soon, there are no explicit equations
defining these functions. Instead, we seek
recursive equations defining each generating function in
terms of the others. Then, fixed points of this recursion will be
explored and utilized to determine the threshold and size
of information epidemics; i.e., situations where the number
of people reached and informed by the original branching
process is infinite. These steps are taken in the next sections,
where we first focus on deriving $h_{fs}(x)$ and $h_{ws}(x)$ (Section
4.1), followed by derivations of $h_{ft}(x)$ and $h_{wt}(x)$ (Section
4.2). These arguments are then combined in Section 4.3 to
derive the epidemic threshold and final epidemic size.

4.1 Information Propagation via Single Edges in Network F

We start by deriving recursive equations for $h_{fs}(x)$ and
$h_{ws}(x)$, by focusing on the number of nodes reached and
informed by following one end of a single edge in $\mathbb{F}$ and
$\mathbb{W}$, respectively. For instance, for $h_{fs}(x)$, we pick one of the
single edges in $\mathbb{F}$ uniformly at random and assume that it
is connected at one end to a node that is in the infected state.
Then, we compute the generating function for the number
of nodes informed by following the other end of the edge. In
what follows, we only derive $h_{fs}(x)$ since the computation
of $h_{ws}(x)$ follows in a very similar manner.

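Similar to (8), we obtain the following expression for the
generating function $h_{fs}(x)$:

$$h_{fs}(x) = T_f\, x \sum_{\boldsymbol{d}} \frac{d_{fs}\, p_{\boldsymbol{d}}}{\langle d_{fs} \rangle}\, h_{fs}(x)^{d_{fs}-1}\, h_{ft}(x)^{n_{ft}}\, h_{ws}(x)^{d_{ws}}\, h_{wt}(x)^{n_{wt}} + (1 - T_f). \tag{9}$$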


We now explain each term appearing in (9) in turn. First of
all, it is straightforward to see that if the selected edge is
not occupied, which happens with probability $1 - T_f$, then
the number of nodes informed by following it will be zero.
This leads to a term $(1 - T_f)x^0$ in the generating function
$h_{fs}(x)$. In words, adding the term $(1 - T_f)x^0$ to $h_{fs}(x)$
means that the probability of the underlying random variable
(encoded by the generating function $h_{fs}(x)$) being zero
is incremented by $1 - T_f$. On the other hand, if the selected
edge is occupied, which happens with probability $T_f$, then
the node at the other end of the edge will be informed. This
means that the number of informed nodes in this process
will be one plus all the nodes that are then informed by the
node at the other end of the selected edge. Adding one to a
random variable is equivalent to multiplying its generating
function by $x$, whence we get the term $T_f x$.

Fig. 5. The top vertex u is infected, and information is transferred through
an edge only if it is occupied (which happens with probability $T_f$ in network F).

The summation term appearing in (9) stands for the number of nodes informed by the aforementioned end node of the randomly selected edge, and is similar in vein to the summation term used in the expression for $H(x)$, with two differences. First, the degree distribution of this end node is not $p_{\boldsymbol{d}}$, since it is already known to have at least one single edge in F. Instead, its degree distribution will be proportional to $d_{fs}\, p_{\boldsymbol{d}}$, and after proper normalization we see that the end node will have degree $\boldsymbol{d} = (d_{fs}, 2n_{ft}, d_{ws}, 2n_{wt})$ with probability $d_{fs}\, p_{\boldsymbol{d}} / \langle d_{fs} \rangle$; e.g., see [6], [12] for similar arguments. Second, if this node has degree $\boldsymbol{d}$, then the number of people it informs is generated by $h_{fs}(x)^{d_{fs}-1} h_{ft}(x)^{n_{ft}} h_{ws}(x)^{d_{ws}} h_{wt}(x)^{n_{wt}}$, with the minus one term on $d_{fs}$ accounting for the fact that one of its single edges in F has carried the information to this node and has already been taken into account. Averaging over all possible $\boldsymbol{d}$, we get (9).

4.2 Information Propagation via Triangles in Network F

We now derive $h_{ft}(x)$, i.e., the generating function for the number of nodes informed by following a random triangle in F; similar arguments hold for $h_{wt}(x)$. We demonstrate this situation in Figure 5, where the top vertex $u$ is infected, and we are interested in computing the generating function for the number of nodes that will be informed by nodes $v$ and $w$. First, by conditioning on the state (occupied or not occupied) of the three edges forming this triangle, we compute the probabilities for neither, one, or both of $v$ and $w$ being informed, respectively. It is not difficult to see that

\[
\begin{cases}
P[\text{neither } v \text{ nor } w \text{ is informed}] = (1 - T_f)^2, \\
P[\text{exactly one of } v \text{ and } w \text{ is informed}] = 2T_f(1 - T_f)^2, \\
P[\text{both } v \text{ and } w \text{ are informed}] = 2T_f^2(1 - T_f) + T_f^2.
\end{cases}
\]

We now explain why the above equations hold. First, for neither $v$ nor $w$ to be informed, both of the edges $u \sim v$ and $u \sim w$ must be unoccupied. By independence, this occurs with probability $(1 - T_f)^2$. Second, we compute the probability of only one of $v$ and $w$ being informed, which by symmetry is given by two times the probability that $v$ is informed but $w$ is not. The latter happens if and only if the edge $u \sim v$ is occupied while the edges $u \sim w$ and $v \sim w$ are not occupied. By independence, this has probability $T_f(1 - T_f)^2$. Finally, the probability that both $v$ and $w$ are informed is given by subtracting the first two probabilities from one.
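The three probabilities above can be checked by brute force. The following Python sketch enumerates the eight occupation states of the edges $u \sim v$, $u \sim w$, and $v \sim w$, and accumulates the probability that zero, one, or two of the nodes $v$ and $w$ are informed. The function name `triangle_outcome_probs` is ours and is used only for illustration; it is not part of the original analysis.

```python
from itertools import product


def triangle_outcome_probs(T):
    """Return {k: P[k of the nodes v, w are informed]} for k = 0, 1, 2.

    Each of the three triangle edges (u-v, u-w, v-w) is occupied
    independently with probability T; the top vertex u is infected.
    """
    probs = {0: 0.0, 1: 0.0, 2: 0.0}
    for uv, uw, vw in product((0, 1), repeat=3):
        # Probability of this particular occupation pattern.
        weight = 1.0
        for occupied in (uv, uw, vw):
            weight *= T if occupied else (1.0 - T)
        # v is informed directly through u-v, or via w through u-w and v-w.
        v_informed = bool(uv or (uw and vw))
        w_informed = bool(uw or (uv and vw))
        probs[int(v_informed) + int(w_informed)] += weight
    return probs


if __name__ == "__main__":
    T = 0.3
    p = triangle_outcome_probs(T)
    print(p[0], (1 - T) ** 2)
    print(p[1], 2 * T * (1 - T) ** 2)
    print(p[2], 2 * T ** 2 * (1 - T) + T ** 2)
```

Each printed pair should agree up to floating-point rounding for any $T \in [0, 1]$.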

We now turn to computing the generating function $h_{ft}(x)$ by conditioning on the three events discussed above. As in Section 4.1, if neither of the nodes $v$ and $w$ is informed, then the number of nodes informed by this triangle will be zero, leading to an additive term $(1 - T_f)^2 x^0$. Next, we derive the term corresponding to the case where only one of $v$ or $w$ is informed. This leads to

\[
\big(2T_f(1 - T_f)^2\big)\, x \sum_{\boldsymbol{d}} \frac{n_{ft}\, p_{\boldsymbol{d}}}{\langle n_{ft} \rangle}\, h_{fs}(x)^{d_{fs}}\, h_{ft}(x)^{n_{ft}-1}\, h_{ws}(x)^{d_{ws}}\, h_{wt}(x)^{n_{wt}}, \tag{10}
\]

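where $2T_f(1 - T_f)^2$ stands for the probability of the conditioning event that only one of $v$ or $w$ is informed, and $x$ stands for the node that is informed. As in Section 4.1, the degree distribution of this informed node will not be given by $p_{\boldsymbol{d}}$, but instead will be proportional to the number of triangles $n_{ft}$ assigned to it; as before, this is due to the fact that the node under consideration is known to have at least one triangle in F. By normalization, we see that the degree of the node will be $\boldsymbol{d} = (d_{fs}, 2n_{ft}, d_{ws}, 2n_{wt})$ with probability $n_{ft}\, p_{\boldsymbol{d}} / \langle n_{ft} \rangle$. The rest of the expression (10) follows similarly to (9), where a minus one term is invoked on $n_{ft}$ in order not to double count the triangle $u, v, w$ that is being considered. Finally, the term corresponding to the case where both $v$ and $w$ are informed is obtained by squaring the factor $x \sum_{\boldsymbol{d}} (\cdot)$ appearing in (10) and weighting it by the probability $2T_f^2(1 - T_f) + T_f^2$ of this event; here we use the powers property of generating functions upon noting that $v$ and $w$ will inform independent sets of nodes under the enforced assumptions. Collecting, we obtain

\[
\begin{aligned}
h_{ft}(x) = {} & (1 - T_f)^2 + \big(2T_f(1 - T_f)^2\big)\, x \sum_{\boldsymbol{d}} \frac{n_{ft}\, p_{\boldsymbol{d}}}{\langle n_{ft} \rangle}\, h_{fs}(x)^{d_{fs}}\, h_{ft}(x)^{n_{ft}-1}\, h_{ws}(x)^{d_{ws}}\, h_{wt}(x)^{n_{wt}} \\
& + \big(2T_f^2(1 - T_f) + T_f^2\big) \left( x \sum_{\boldsymbol{d}} \frac{n_{ft}\, p_{\boldsymbol{d}}}{\langle n_{ft} \rangle}\, h_{fs}(x)^{d_{fs}}\, h_{ft}(x)^{n_{ft}-1}\, h_{ws}(x)^{d_{ws}}\, h_{wt}(x)^{n_{wt}} \right)^{2}.
\end{aligned} \tag{11}
\]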


4.3 Computing the Final Epidemic Size 

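We are now in a position to write the recursive equations for the generating functions $h_{fs}(x)$, $h_{ft}(x)$, $h_{ws}(x)$, and $h_{wt}(x)$, whose solution will be substituted into the expression for $H(x)$ to get the final epidemic size. Using (9) and (11) and similar expressions for $h_{ws}(x)$ and $h_{wt}(x)$, we obtain

\[
h_{fs}(x) = T_f\, x \sum_{\boldsymbol{d}} \frac{d_{fs}\, p_{\boldsymbol{d}}}{\langle d_{fs} \rangle}\, h_{fs}(x)^{d_{fs}-1} h_{ft}(x)^{n_{ft}} h_{ws}(x)^{d_{ws}} h_{wt}(x)^{n_{wt}} + (1 - T_f), \tag{12}
\]

\[
\begin{aligned}
h_{ft}(x) = {} & \big(2T_f(1 - T_f)^2\big)\, x \sum_{\boldsymbol{d}} \frac{n_{ft}\, p_{\boldsymbol{d}}}{\langle n_{ft} \rangle}\, h_{fs}(x)^{d_{fs}} h_{ft}(x)^{n_{ft}-1} h_{ws}(x)^{d_{ws}} h_{wt}(x)^{n_{wt}} \\
& + \big(2T_f^2(1 - T_f) + T_f^2\big) \left( x \sum_{\boldsymbol{d}} \frac{n_{ft}\, p_{\boldsymbol{d}}}{\langle n_{ft} \rangle}\, h_{fs}(x)^{d_{fs}} h_{ft}(x)^{n_{ft}-1} h_{ws}(x)^{d_{ws}} h_{wt}(x)^{n_{wt}} \right)^{2} + (1 - T_f)^2,
\end{aligned} \tag{13}
\]

\[
h_{ws}(x) = T_w\, x \sum_{\boldsymbol{d}} \frac{d_{ws}\, p_{\boldsymbol{d}}}{\langle d_{ws} \rangle}\, h_{fs}(x)^{d_{fs}} h_{ft}(x)^{n_{ft}} h_{ws}(x)^{d_{ws}-1} h_{wt}(x)^{n_{wt}} + (1 - T_w), \tag{14}
\]

\[
\begin{aligned}
h_{wt}(x) = {} & \big(2T_w(1 - T_w)^2\big)\, x \sum_{\boldsymbol{d}} \frac{n_{wt}\, p_{\boldsymbol{d}}}{\langle n_{wt} \rangle}\, h_{fs}(x)^{d_{fs}} h_{ft}(x)^{n_{ft}} h_{ws}(x)^{d_{ws}} h_{wt}(x)^{n_{wt}-1} \\
& + \big(2T_w^2(1 - T_w) + T_w^2\big) \left( x \sum_{\boldsymbol{d}} \frac{n_{wt}\, p_{\boldsymbol{d}}}{\langle n_{wt} \rangle}\, h_{fs}(x)^{d_{fs}} h_{ft}(x)^{n_{ft}} h_{ws}(x)^{d_{ws}} h_{wt}(x)^{n_{wt}-1} \right)^{2} + (1 - T_w)^2.
\end{aligned} \tag{15}
\]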


The desired generating function $H(x)$ for the finite number of nodes informed in the network can now be computed in the following manner. For any $x$, we solve the recursive relations (12)-(15); i.e., we find a fixed point of (12)-(15). Then, substituting the resulting values of $h_{fs}(x)$, $h_{ft}(x)$, $h_{ws}(x)$, and $h_{wt}(x)$ into the expression for $H(x)$, we obtain $H(x)$ for this particular value of $x$. Repeating the same process for every $x$ leads to a complete characterization of $H(x)$. However, in this work we are only interested in the cases where the number of nodes informed by the process is infinite. More precisely, we wish to derive (i) the conditions for the probability of informing a positive fraction of nodes to be larger than zero in the large $n$ limit; and (ii) the exact asymptotic fraction of informed individuals when the conditions of part (i) hold. As explained in Section 2.4, the latter also gives the probability of triggering an epidemic starting with a random node.

In order to achieve these goals, we take advantage of the "conservation of probability" property of generating functions, i.e., the fact that $H(1) = 1$ when the number of nodes reached and informed is always finite. If, on the other hand, $H(1) < 1$, we understand that there is a positive probability $1 - H(1)$ for the aforementioned branching process to lead to an infinite component of informed nodes; i.e., for the branching process to be supercritical. In this case, $1 - H(1)$ stands for the fraction of nodes that are in the giant component of H. Recalling the discussion in Section 2.4, we know that information propagation will turn into an epidemic if and only if the initiator node is in this giant component. Thus, we conclude that the probability of an epidemic is given by $1 - H(1)$, and so is the relative final size of epidemics.

With these in mind, we now seek a fixed point of the recursion (12)-(15) at the point $x = 1$. For notational convenience, we define $h_1 := h_{fs}(1)$, $h_2 := h_{ft}(1)$, $h_3 := h_{ws}(1)$, and $h_4 := h_{wt}(1)$. The recursion (12)-(15) then takes the form

\[
h_i = g_i(h_1, h_2, h_3, h_4), \qquad i = 1, 2, 3, 4, \tag{16}
\]

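where $g_1$, $g_2$, $g_3$, and $g_4$ are functions immediately obtainable from (12)-(15); e.g., we have

\[
g_1(h_1, h_2, h_3, h_4) = T_f \sum_{\boldsymbol{d}} \frac{d_{fs}\, p_{\boldsymbol{d}}}{\langle d_{fs} \rangle}\, h_1^{d_{fs}-1}\, h_2^{n_{ft}}\, h_3^{d_{ws}}\, h_4^{n_{wt}} + (1 - T_f).
\]

With this notation, we also have

\[
H(1) = \sum_{\boldsymbol{d}} p_{\boldsymbol{d}}\, h_1^{d_{fs}}\, h_2^{n_{ft}}\, h_3^{d_{ws}}\, h_4^{n_{wt}}. \tag{17}
\]

It is easy to check that the recursion (12)-(15) exhibits a trivial fixed point $h_1 = h_2 = h_3 = h_4 = 1$, which leads to $H(1) = 1$, meaning that the branching process is sub-critical and all informed components have finite size. However, this trivial solution describes the system only when it is an attractor; i.e., a stable fixed point. We check the stability of this solution via linearization of (12)-(15) around $h_1 = h_2 = h_3 = h_4 = 1$ at $x = 1$, which leads to the Jacobian matrix $J$ whose entries are given by

\[
J(i, j) = \left. \frac{\partial g_i(h_1, h_2, h_3, h_4)}{\partial h_j} \right|_{h_1 = h_2 = h_3 = h_4 = 1}
\]

for each $i, j = 1, 2, 3, 4$. Namely, we have

\[
J = \begin{bmatrix}
T_f \dfrac{\langle d_{fs}^2 - d_{fs} \rangle}{\langle d_{fs} \rangle} &
T_f \dfrac{\langle d_{fs} n_{ft} \rangle}{\langle d_{fs} \rangle} &
T_f \dfrac{\langle d_{fs} d_{ws} \rangle}{\langle d_{fs} \rangle} &
T_f \dfrac{\langle d_{fs} n_{wt} \rangle}{\langle d_{fs} \rangle} \\[2ex]
2T_f(1 + T_f - T_f^2) \dfrac{\langle d_{fs} n_{ft} \rangle}{\langle n_{ft} \rangle} &
2T_f(1 + T_f - T_f^2) \dfrac{\langle n_{ft}^2 - n_{ft} \rangle}{\langle n_{ft} \rangle} &
2T_f(1 + T_f - T_f^2) \dfrac{\langle d_{ws} n_{ft} \rangle}{\langle n_{ft} \rangle} &
2T_f(1 + T_f - T_f^2) \dfrac{\langle n_{wt} n_{ft} \rangle}{\langle n_{ft} \rangle} \\[2ex]
T_w \dfrac{\langle d_{ws} d_{fs} \rangle}{\langle d_{ws} \rangle} &
T_w \dfrac{\langle d_{ws} n_{ft} \rangle}{\langle d_{ws} \rangle} &
T_w \dfrac{\langle d_{ws}^2 - d_{ws} \rangle}{\langle d_{ws} \rangle} &
T_w \dfrac{\langle d_{ws} n_{wt} \rangle}{\langle d_{ws} \rangle} \\[2ex]
2T_w(1 + T_w - T_w^2) \dfrac{\langle n_{wt} d_{fs} \rangle}{\langle n_{wt} \rangle} &
2T_w(1 + T_w - T_w^2) \dfrac{\langle n_{wt} n_{ft} \rangle}{\langle n_{wt} \rangle} &
2T_w(1 + T_w - T_w^2) \dfrac{\langle d_{ws} n_{wt} \rangle}{\langle n_{wt} \rangle} &
2T_w(1 + T_w - T_w^2) \dfrac{\langle n_{wt}^2 - n_{wt} \rangle}{\langle n_{wt} \rangle}
\end{bmatrix}. \tag{18}
\]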


Now, if the largest eigenvalue in absolute value of the Jacobian matrix $J$, denoted by $\sigma(J)$, is less than or equal to one, then the trivial solution mentioned above is an attractor, whence all informed components have finite size, as understood from the conservation of probability; i.e., from $H(1) = 1$. However, if $\sigma(J) > 1$, then the trivial solution will not be stable and another solution with $h_1, h_2, h_3, h_4 < 1$ will exist. This in turn leads to $H(1) < 1$, meaning that information epidemics take place with probability $1 - H(1) > 0$ and reach an expected fraction $1 - H(1)$ of the whole population, where $H(1)$ is computed from (17).
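To make the threshold condition concrete, the sketch below evaluates the Jacobian at the trivial fixed point and its spectral radius for one special case. It is a minimal illustration under assumptions that we state explicitly: the counts of single edges and triangles are independent Poisson variables (the doubly Poisson case), and every node belongs to F independently with probability $\alpha$, so that, e.g., $\langle d_{fs}^2 - d_{fs} \rangle / \langle d_{fs} \rangle = \lambda_{fs}$ and $\langle d_{ws} d_{fs} \rangle / \langle d_{ws} \rangle = \alpha \lambda_{fs}$. The function names, the use of power iteration, and the bisection helper are our own choices and are not taken from the original analysis.

```python
def jacobian(alpha, lam_fs, lam_ft, lam_ws, lam_wt, T_f, T_w):
    """Jacobian at h = (1, 1, 1, 1), ASSUMING independent Poisson counts
    and membership in F with probability alpha (requires alpha > 0)."""
    c_f = 2.0 * T_f * (1.0 + T_f - T_f ** 2)
    c_w = 2.0 * T_w * (1.0 + T_w - T_w ** 2)
    # Mean (excess) counts seen at a node reached through network F.
    row_f = [lam_fs, lam_ft, lam_ws, lam_wt]
    # Same for a node reached through W; it is in F only with prob. alpha.
    row_w = [alpha * lam_fs, alpha * lam_ft, lam_ws, lam_wt]
    return [
        [T_f * v for v in row_f],
        [c_f * v for v in row_f],
        [T_w * v for v in row_w],
        [c_w * v for v in row_w],
    ]


def spectral_radius(matrix, iterations=500):
    """Power iteration; adequate here because the entries are positive."""
    n = len(matrix)
    vec = [1.0] * n
    radius = 0.0
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        radius = max(nxt)
        if radius == 0.0:
            return 0.0
        vec = [v / radius for v in nxt]
    return radius


def critical_T(alpha, lam, tol=1e-10):
    """Bisection on T = T_f = T_w for sigma(J) = 1, all four means equal
    to lam. Returns a value close to 1 if no epidemic is possible."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if spectral_radius(jacobian(alpha, lam, lam, lam, lam, mid, mid)) > 1.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for alpha in (0.1, 0.5, 0.9):
        print(alpha, critical_T(alpha, 0.5))
```

Under these assumptions the computed critical transmissibility decreases as $\alpha$ grows, since a larger $\alpha$ adds edges to the system while the per-node means in F are held fixed.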

Collecting, the threshold of information epidemics is given by $\sigma(J) = 1$, where $\sigma(J)$ is the spectral radius of the Jacobian matrix given in (18). Also, the mean epidemic size (i.e., the fractional size of the giant component of the percolated network H) can be computed by first finding the pointwise smallest solution of the recursion (12)-(15) at $x = 1$, and then substituting the result into (17) to get $H(1)$. As discussed before, the mean size of epidemics is given by $1 - H(1)$.
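The procedure just described can be sketched numerically. The Python code below iterates the recursion at $x = 1$ starting from $h_i = 0$, which for this monotone map converges to the pointwise smallest fixed point, and then returns $1 - H(1)$. It is a sketch under assumptions that are ours: independent Poisson counts with means `lam_fs`, `lam_ft`, `lam_ws`, `lam_wt`, and membership in F with probability `alpha`, so that the sums over $\boldsymbol{d}$ reduce to exponentials. The function name `epidemic_size` is hypothetical.

```python
import math


def epidemic_size(alpha, lam_fs, lam_ft, lam_ws, lam_wt, T_f, T_w,
                  tol=1e-12, max_iter=100000):
    """Return 1 - H(1) for the doubly Poisson case (an assumed special case)."""
    # Probabilities that exactly one / both / none of the two other
    # triangle vertices are informed, for each network.
    one_f, both_f, none_f = (2 * T_f * (1 - T_f) ** 2,
                             2 * T_f ** 2 * (1 - T_f) + T_f ** 2,
                             (1 - T_f) ** 2)
    one_w, both_w, none_w = (2 * T_w * (1 - T_w) ** 2,
                             2 * T_w ** 2 * (1 - T_w) + T_w ** 2,
                             (1 - T_w) ** 2)

    def sums(h):
        # Poisson generating functions of the F part and the W part.
        a = math.exp(lam_fs * (h[0] - 1) + lam_ft * (h[1] - 1))
        b = math.exp(lam_ws * (h[2] - 1) + lam_wt * (h[3] - 1))
        s_f = a * b                        # node reached through F (it is in F)
        s_w = (alpha * a + 1 - alpha) * b  # node reached through W
        return s_f, s_w

    h = [0.0, 0.0, 0.0, 0.0]  # start below the smallest fixed point
    for _ in range(max_iter):
        s_f, s_w = sums(h)
        new = [
            T_f * s_f + 1 - T_f,
            one_f * s_f + both_f * s_f ** 2 + none_f,
            T_w * s_w + 1 - T_w,
            one_w * s_w + both_w * s_w ** 2 + none_w,
        ]
        converged = max(abs(x - y) for x, y in zip(new, h)) < tol
        h = new
        if converged:
            break
    # Under these assumptions H(1) coincides with the W-side sum s_w.
    _, H1 = sums(h)
    return 1.0 - H1


if __name__ == "__main__":
    for T in (0.2, 0.5, 1.0):
        print(T, epidemic_size(0.5, 0.5, 0.5, 0.5, 0.5, T, T))
```

Starting the iteration from zero rather than from one matters: the trivial fixed point $h_i = 1$ always exists, and an iteration started there would never leave it.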


4.4 The Relationship between Our Analysis and Some 
Previous Studies 

Our results generalize some of the existing work in the literature; e.g., see [6], [20], [21]. First, by letting $h_{fs}(x) = h_{ft}(x) = 1$ in (12)-(15), we ensure that F is an empty graph, so that our system model is equivalent to the single clustered network considered in [20], [21]. Similarly, if we set $h_{ft}(x) = h_{wt}(x) = 1$, then neither F nor W will have triangle edges, rendering our system equivalent to the non-clustered multi-layer network studied in [6]. A careful inspection reveals that in both special cases, our results recover the findings of [6], [20], [21].


5 Numerical Results and Discussion


This section is devoted to presenting numerical results on information propagation in clustered multi-layer networks in specific settings with given degree distributions. In what follows, we first consider a simple case where both constituent networks in our model have doubly Poisson degree distributions $p_{st}$, while in Section 5.2 we consider the more realistic case where $p_{st}$ is a power-law degree distribution with exponential cut-off. Sections 5.3 and 5.4 are devoted to understanding the impact of clustering and of the parameter $\alpha$ on the dynamics of information propagation, respectively.


5.1 Networks with Doubly Poisson Distributions 

Consider the case where both $p^f_{st}$ and $p^w_{st}$ are doubly Poisson; i.e., the number of single edges and triangles in both































IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, 2015 




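networks are independent and they all follow a Poisson distribution. Namely, we set

\[
p^f_{st} = e^{-\mu_{f,1}} \frac{(\mu_{f,1})^s}{s!}\, e^{-\mu_{f,2}} \frac{(\mu_{f,2})^t}{t!}, \qquad s, t = 0, 1, 2, \ldots, \tag{19}
\]

and

\[
p^w_{st} = e^{-\mu_{w,1}} \frac{(\mu_{w,1})^s}{s!}\, e^{-\mu_{w,2}} \frac{(\mu_{w,2})^t}{t!}, \qquad s, t = 0, 1, 2, \ldots, \tag{20}
\]

where $s$ and $t$ are the numbers of single edges and triangles in the corresponding network, while $\mu_{f,1}$ and $\mu_{f,2}$ (resp. $\mu_{w,1}$ and $\mu_{w,2}$) are their respective means in F (resp. in W).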

Under this setting, the mean epidemic size as well as the epidemic threshold can be computed from the analytical results presented in Section 4.3. To check the validity of our analysis for finite-sized networks, we have also conducted an extensive numerical study. In particular, we consider n = 5 x 10® nodes in the population and three different values $\alpha = 0.1, 0.5, 0.9$ for the relative size of network F. We let $\mu_{f,1} = \mu_{f,2} = \lambda_{fs} = \lambda_{ft} = 0.5$ and similarly $\mu_{w,1} = \mu_{w,2} = \lambda_{ws} = \lambda_{wt} = 0.5$. For each value of the information transmissibility $T_w = T_f$, we generate 100 independent realizations of the multi-layer network H and compute the size of the largest connected component (of the percolated network H) in each case. The results are then averaged over the 100 experiments to obtain the empirical size of information epidemics.

The results are depicted in Figure 6, where the curves stand for the theoretical results obtained from our discussion in Section 4.3, while the markers stand for the empirical results obtained from simulation experiments. We see that there is a perfect agreement between the analytical and experimental results, confirming the validity of our results even when $n$ is finite. We also see that as $\alpha$ increases, the critical threshold is reduced and the epidemic size is enlarged. This is intuitive given that the network becomes denser with increasing $\alpha$. A more detailed discussion on the impact of the parameter $\alpha$ on the characteristics of information propagation in a multi-layer network is provided in Section 5.4 below.


5.2 Networks with Power-law Degree Distributions 

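Many real-world networks, including the Internet (at the level of autonomous systems), the phone call network, the e-mail network, and the web link network, have been shown to exhibit power-law degree distributions with exponential cut-off. To gain more insight about our results for more realistic network models, we next consider the case where both F and W have power-law degree distributions with exponential cut-off. Namely, we have

\[
p^f_{st} = \begin{cases}
0, & s = 0 \text{ or } t = 0, \\[1ex]
\dfrac{s^{-\gamma_{f,1}}\, e^{-s/\Gamma_{f,1}}}{\mathrm{Li}_{\gamma_{f,1}}\!\big(e^{-1/\Gamma_{f,1}}\big)} \cdot \dfrac{t^{-\gamma_{f,2}}\, e^{-t/\Gamma_{f,2}}}{\mathrm{Li}_{\gamma_{f,2}}\!\big(e^{-1/\Gamma_{f,2}}\big)}, & s, t = 1, 2, \ldots,
\end{cases} \tag{21}
\]

and

\[
p^w_{st} = \begin{cases}
0, & s = 0 \text{ or } t = 0, \\[1ex]
\dfrac{s^{-\gamma_{w,1}}\, e^{-s/\Gamma_{w,1}}}{\mathrm{Li}_{\gamma_{w,1}}\!\big(e^{-1/\Gamma_{w,1}}\big)} \cdot \dfrac{t^{-\gamma_{w,2}}\, e^{-t/\Gamma_{w,2}}}{\mathrm{Li}_{\gamma_{w,2}}\!\big(e^{-1/\Gamma_{w,2}}\big)}, & s, t = 1, 2, \ldots,
\end{cases} \tag{22}
\]

where $\mathrm{Li}_m(z) = \sum_{k=1}^{\infty} z^k / k^m$ is the $m$-th polylogarithm of $z$.

Fig. 6. Simulation for doubly Poisson degree distributions.

Fig. 7. Simulation for power-law degree distributions.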

In order to compute an analytical expression for the size of information epidemics, we proceed as in the case of doubly Poisson distributions and use our results presented in Section 4.3.

For computer simulations, we again set n = 2 x 10®, and use $\alpha = 0.1, 0.5, 0.9$ as three sample sizes for the network F. The corresponding degree distributions are given by (21) and (22) with $\gamma_{f,1} = \gamma_{f,2} = \gamma_{w,1} = \gamma_{w,2} = 2.5$ and $\Gamma_{f,1} = \Gamma_{f,2} = \Gamma_{w,1} = \Gamma_{w,2} = 10$. With $T_f = T_w$ ranging from zero to one, we compute the empirical size of the information epidemics, again by averaging over 100 independent experiments. The results are shown in Figure 7, where curves are obtained analytically using our discussion in Section 4.3 and markers represent numerical results. We again see a perfect agreement between our analysis and numerical results.


5.3 How does Clustering Affect the Threshold and Size 
of Information Epidemics? 

An important goal of this work is to understand how clustering affects the dynamics of information propagation in multi-layer networks. Given the complexity of the model adopted here, this can be studied in several different ways; e.g., by controlling the clustering coefficient of only one of the networks F or W, or by adjusting both networks' clustering simultaneously. Also, the way we adjust the clustering coefficient of a given network can have a significant impact on the conclusions obtained, given that such changes might also affect the degree-degree correlations (e.g., assortativity) in the network (see footnote 2). The situation becomes even more involved as one realizes that the choice of the parameter $\alpha$ changes the assortativity of the network as well.

With these in mind, we consider doubly Poisson distributions in the remainder of this discussion. We first consider a scenario where one or both of the constituent networks in the system are changed from a non-clustered network to a clustered network. More precisely, we compare the following three cases:

• Both networks are non-clustered (NN)

• Network W is clustered but network F is non-clustered (NC)

• Both networks are clustered (CC)


Here, the clustered networks are generated as discussed in Section 2.1, following the approach of Miller [20] and Newman [21], say with doubly Poisson degree distribution $p_{st}$ with parameter $\lambda_s$ for single edges and $\lambda_t$ for triangle edges. To ensure a fair comparison, we generate non-clustered networks with the same total degree distribution and degree-degree correlations. To this end, we generate the non-clustered networks using the multiplex (i.e., colored) version of the configuration model. Namely, each node gets $\mathrm{Poi}(\lambda_s)$ stubs of color blue and $2 \times \mathrm{Poi}(\lambda_t)$ stubs of color red, and then stubs of the same color are randomly matched to form edges. The standard configuration model, where colors are ignored, would lead to the same degree distribution, but would fail to capture the positive degree correlations inherent in the random clustered networks proposed by Miller [20] and Newman [21].
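The colored stub-matching step described above can be sketched as follows. This is a minimal illustration rather than the simulation code used for the reported results: the function names are hypothetical, the Poisson sampler is a textbook multiplication method suitable for small means, a leftover unmatched stub is simply discarded, and self-loops or repeated edges are not removed.

```python
import math
import random


def sample_poisson(lam, rng):
    """Multiplication method for Poisson sampling; fine for small lam."""
    threshold = math.exp(-lam)
    count, running = 0, rng.random()
    while running > threshold:
        count += 1
        running *= rng.random()
    return count


def colored_configuration_model(n, lam_s, lam_t, seed=0):
    """Return a list of (node_a, node_b, color) edges.

    Each node gets Poi(lam_s) blue stubs and 2 * Poi(lam_t) red stubs;
    only stubs of the same color are matched to each other.
    """
    rng = random.Random(seed)
    stubs = {"blue": [], "red": []}
    for node in range(n):
        stubs["blue"].extend([node] * sample_poisson(lam_s, rng))
        stubs["red"].extend([node] * (2 * sample_poisson(lam_t, rng)))
    edges = []
    for color, pool in stubs.items():
        rng.shuffle(pool)          # a uniformly random matching of stubs
        if len(pool) % 2 == 1:     # drop one stub if the total is odd
            pool.pop()
        for i in range(0, len(pool), 2):
            edges.append((pool[i], pool[i + 1], color))
    return edges


if __name__ == "__main__":
    edges = colored_configuration_model(10000, 0.5, 0.5, seed=1)
    print("mean degree:", 2 * len(edges) / 10000)  # close to 0.5 + 2 * 0.5
```

Ignoring the color label in the matching step would turn this into the standard configuration model with the same degree distribution, which is the simplex comparison discussed in the text.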

The results are depicted in Figure 8, where we compare the relative size of the epidemics as $T_f = T_w$ varies from zero to one, and $\alpha$ is taken to be 0.1, 0.5, or 0.9. The resulting global and local clustering coefficients and the assortativity values of network H can be found in Table 1. For each $\alpha$ value, we see that clustering increases as we go from NN to NC to CC, while assortativity stays the same. Our main conclusion from Figure 8 is that, for a given $\alpha$, the curve for NN is always above that of NC, which in turn is always above that of CC. That is, the critical threshold of information epidemics increases while the final epidemic size decreases as we move from NN to NC to CC, i.e., as the clustering coefficient in the whole system increases. Therefore, we conclude that a high level of clustering not only makes it more difficult for information to reach a significant fraction of the population, but also reduces the mean epidemic size at any level of information transmissibility.

The inhibitive effect of clustering on epidemics has been observed in the single network case as well [20], and is often attributed to the fact that the edges used for completing wedges to triangles are redundant for the purposes of information propagation; a wedge is defined as a connected triple that is not a triangle. This is particularly evident when $T_f = T_w = 1$, in which case the size of the epidemics is equal to the giant component size in H. It is clear that adding an extra edge to this graph that transforms a wedge into a triangle has no effect on its giant component; in contrast, it may be possible to increase the giant component size by adding this extra edge somewhere else in the network. Therefore, as long as the degree distributions and degree-degree correlations are fixed, random networks with low clustering will tend to have a larger epidemic size and a lower epidemic threshold.

Fig. 8. Comparison of the size of information epidemics between Non-clustered and Non-clustered networks (NN), Clustered and Non-clustered networks (NC), and Clustered and Clustered networks (CC). Plots are obtained from our analytical results. The value following the model abbreviation indicates the amount of overlap between the two networks. For example, NC-0.5 means that $\alpha = 0.5$.

2. The assortativity coefficient is the Pearson correlation coefficient of degree between pairs of linked nodes; the details of the computation can be found in [44].

In order to understand the effect of clustering better, we next consider a different setting where we control the level of clustering in network W while keeping its mean total degree fixed. More precisely, we use Poisson distributions for the number of single and triangle edges in both networks with parameters given in Table 2. Put differently, network F has a fixed clustering coefficient, while with $c \in [0, 4]$ the clustering of W varies between two extremes: (i) when $c = 4$, W will have no single edges and consist only of triangles, resulting in a clustering coefficient close to one; and (ii) with $c = 0$, there will be no triangles in W and hence its clustering coefficient will be close to zero. Thus, with increasing $c$, the clustering coefficient of W increases, which in turn increases clustering in the multilayer network H; see Table 3 for specific clustering coefficients corresponding to several $c$ values considered. We remark that by the choice
given in Table 2, the degree distribution (single edges plus triangle edges) of W is given by

\[
2\,\mathrm{Poi}\!\left(\tfrac{4-c}{2}\lambda\right) + 2\,\mathrm{Poi}\!\left(\tfrac{c}{2}\lambda\right).
\]

This ensures that as $c$ varies, both the mean and the variance of the degree distribution remain constant, allowing us to focus only on the effect of clustering; for instance, using $\mathrm{Poi}((4-c)\lambda)$ rather than $2\,\mathrm{Poi}(\tfrac{4-c}{2}\lambda)$ would change the variance of the distribution and hence the threshold for information epidemics (viz. (18)). As seen from Table 3, the assortativity of the network also remains constant with varying $c$.

              Non-clustering and Non-clustering     Non-clustering and Clustering         Clustering and Clustering
$\alpha$      Global    Local     Assortativity     Global    Local     Assortativity     Global    Local     Assortativity
              Coeff.    Coeff.                      Coeff.    Coeff.                      Coeff.    Coeff.
0.1           0         0         0.106             0.25      0.48      0.106             0.27      0.51      0.106
0.5           0         0         0.071             0.14      0.31      0.071             0.21      0.42      0.071
0.9           0         0         0.044             0.01      0.19      0.044             0.19      0.35      0.044

TABLE 1
Statistics of network H under the setting of Figure 8.

              Distribution of single-edges                      Distribution of triangles
Network F     $\mathrm{Poi}(2\lambda_F)$                        $\mathrm{Poi}(\lambda_F)$
Network W     $2\,\mathrm{Poi}(\tfrac{4-c}{2}\lambda_W)$        $\mathrm{Poi}(\tfrac{c}{2}\lambda_W)$

TABLE 2
Parameters of the doubly Poisson distribution. In Figure 9 we set $\lambda_F = \lambda_W = 0.5$. We use $\lambda_F = 0.36$ and $\lambda_W = 0.5$ for Figure 10.

                                          Clust. Coefficients
$\alpha$      $c$       Assortativity     Global    Local
0.1           0.01      0.010             0.005     0.006
0.1           2.00      0.010             0.095     0.230
0.1           3.99      0.010             0.185     0.453
0.9           0.01      0.009             0.023     0.044
0.9           2.00      0.009             0.075     0.152
0.9           3.99      0.009             0.126     0.260

TABLE 3
Statistics corresponding to the network H in the setting of Figure 9.
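The claim that the parametrization of W keeps both the mean and the variance of the degree fixed can be verified in two lines. The derivation below is a worked check under the stated model (independent Poisson counts for single edges and triangles); the symbol $D_W$ for the total degree of a node in W is introduced here only for this check. It uses the facts that, for $X \sim \mathrm{Poi}(a)$, $E[2X] = 2a$ and $\mathrm{Var}[2X] = 4a$.

```latex
\begin{align*}
E[D_W] &= 2 \cdot \tfrac{4-c}{2}\lambda + 2 \cdot \tfrac{c}{2}\lambda = 4\lambda, \\
\mathrm{Var}[D_W] &= 4 \cdot \tfrac{4-c}{2}\lambda + 4 \cdot \tfrac{c}{2}\lambda = 8\lambda.
\end{align*}
% Alternative choice: single edges drawn as Poi((4-c)\lambda) instead.
\begin{align*}
E[D_W'] &= (4-c)\lambda + 2 \cdot \tfrac{c}{2}\lambda = 4\lambda, \\
\mathrm{Var}[D_W'] &= (4-c)\lambda + 4 \cdot \tfrac{c}{2}\lambda = (4+c)\lambda.
\end{align*}
```

Neither the mean nor the variance of $D_W$ depends on $c$, whereas the alternative choice keeps the mean but lets the variance grow with $c$, which would confound the effect of clustering with that of the degree variance.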


With these in mind, we first demonstrate in Figure 9 the boundary in the $T_f$-$T_w$ plane that identifies the threshold of information epidemics. Put differently, for each parameter pair $(c, \alpha)$, the curves in Figure 9 separate the region where information epidemics can take place (north and east of the curves) from the region where they cannot (south-west of the curves). We see that for the same $T_f$, clustering increases the minimum $T_w$ that is needed for information epidemics to be possible. In other words, we see again that clustering increases the threshold of epidemics.

Next, we look at the effect of clustering on the relative final size of information epidemics for specific percolation (i.e., transmissibility) probabilities. From Figure 10, we see that the size of the giant component decreases as the clustering coefficient increases, again confirming that high clustering reduces the epidemic size.



Fig. 9. Comparison of the epidemic boundary under several cases; the north and east of each curve specifies the region of $(T_f, T_w)$ values for which epidemics are possible, while the south and west part of each curve stands for the region where epidemics cannot take place. Resulting statistics for clustering and assortativity are given in Table 3.


5.4 How does $\alpha$ Affect the Information Propagation Dynamics?

We now shift our focus to understanding the impact of the parameter $\alpha$, which controls the relative size of network F to network W, on the information propagation dynamics. From a practical perspective, this will help understand the role of the size of an online social network, say Facebook, in propagating information. As we shall demonstrate soon, this parameter's impact on the overall network topology goes beyond a change in degree distribution, and thus its effect on the epidemic threshold and epidemic size is highly non-trivial.

Fig. 10. Illustration of how clustering affects the size of epidemics when $T_f = T_w = 0.3$.

In order to focus only on the impact of $\alpha$, we consider non-clustered networks throughout this section; however, a similar discussion would hold for clustered networks as well. Let W and F be random networks with given degree distributions, and let $H = W \coprod F$ be their disjoint union; i.e., network H is a colored degree-driven random graph. As before, W is defined on $n$ vertices, each of which belongs to the vertex set of F independently with probability $\alpha$. For simplicity, we assume that network W has a Poisson degree distribution with parameter $\lambda_{red}$, while network F has degree distribution $\mathrm{Poi}(\lambda_{blue})$. Under this setting, each of the $n$ nodes in graph H will have a colored degree distribution given by

\[
\begin{aligned}
p_k^{red} &= e^{-\lambda_{red}} \frac{\lambda_{red}^k}{k!}, & k = 0, 1, \ldots, \\
p_k^{blue} &= \alpha\, e^{-\lambda_{blue}} \frac{\lambda_{blue}^k}{k!} + (1 - \alpha)\, \mathbf{1}[k = 0], & k = 0, 1, \ldots,
\end{aligned} \tag{23}
\]

where blue edges represent links in F and red edges represent links in W. The multiplex network (MN) H is then generated by the colored configuration model, where only stubs of the same color are connected together to form an edge.

To check the impact of $\alpha$ on the size of information epidemics in a fair way, we keep the value of $\alpha\lambda_{blue}$ fixed throughout the experiments. This ensures that the mean number of blue edges in H remains constant as $\alpha$ varies. Put differently, this setting allows us to compare the impact of a small but densely connected social network with a large but loosely connected one in facilitating the propagation of information. Below, we will argue why an adjustment of $\alpha$ changes not only the degree distribution but also the degree-degree correlations in the network. To make this point clearer, we also include in our comparison the simplex network (SN) case, which ignores the colors of the edges and generates H via the standard configuration model with degree distribution $p_k = (p^{red} \circledast p^{blue})_k$; here $\circledast$ denotes the convolution operator.

The results comparing the final epidemic sizes for three specific $\alpha$ values are given in Figure 11. These plots are obtained via computer simulations with n = 5 x 10®, $\lambda_{red} = 1$, $\alpha\lambda_{blue} = 1$, and $T_w = T_f$ varied from zero to one; each data point corresponds to an average over 100 independent runs. We list the resulting assortativity values for each case in Table 4. As expected, the simplex case that corresponds to the standard configuration model has uncorrelated degrees, and thus the resulting assortativity is zero. However, we see that aside from changing the degree distribution in the network, the relative size of F also has a significant impact on the degree-degree correlations in the multiplex case. This impact, namely the positive correlations observed between the degrees of neighbors, is particularly pronounced in the case where $\alpha$ is small; e.g., assortativity is 0.96 when $\alpha = 0.01$. This can be attributed to the fact that when $\alpha$ is close to zero, a very small fraction of nodes receive a large number of blue edges (since $\alpha\lambda_{blue}$ is fixed), and these extra edges can only be used to connect with other nodes that also have extra edges; as before, red edges are assigned to every node. As a result, the network H exhibits a very densely connected (community-like) subgraph on the vertices that participate in F, and this leads to highly positive degree-degree correlations, given that the nodes in F have significantly larger (in the statistical sense) degrees than nodes that are not in F.


                  $\alpha = 0.01$       $\alpha = 0.10$       $\alpha = 0.99$
                  SN       MN           SN       MN           SN       MN
Assortativity     0.00     0.96         0.00     0.65         0.00     0.00

TABLE 4
Comparison of the assortativity values observed in the setting of Figure 11 for the Simplex Network (SN) and the Multiplex Network (MN) case for different $\alpha$. As expected, for the simplex network case the degrees of the nodes are uncorrelated and assortativity is thus zero. The multiplex case exhibits assortative mixing, with the correlations getting more significant with decreasing $\alpha$.


There are a number of interesting conclusions we can derive from Figure 11. First, by comparing the simplex and multiplex cases with each other for each $\alpha$ value (i.e., by comparing the line and the marker that are of the same color in Figure 11), we see that multiplex networks have a smaller epidemic threshold as well as a smaller epidemic size compared to the corresponding simplex network for small $\alpha$ values; on the other hand, for $\alpha \approx 1$, the differences are negligible. This observation is in line with the assortativity values seen in Table 4, noting that assortativity is known [20] to reduce the critical threshold and the size of epidemics.

Fig. 11. Illustration of the effect of $\alpha$. SN is the abbreviation for Simplex Network, where colors of the edges are ignored, while MN indicates the Multiplex Network case, where only stubs of the same color are connected together. (Inset) The plots for $\alpha = 0.01$ are shown at a higher resolution near the phase transition point.

Second, we focus on the impact of $\alpha$ on the threshold and size of epidemics by comparing the three lines in Figure 11 that correspond to the multiplex case with $\alpha = 0.01$, $\alpha = 0.1$, and $\alpha = 0.99$, respectively. We observe that as $\alpha$ gets larger, the epidemic threshold increases and so does the final epidemic size. This means that it will be more difficult to trigger an information epidemic when the physical network is augmented with a large online social network that is loosely connected, compared to the case when the online social network is small but densely connected. However, when information transmissibility is high in both networks, the final epidemic size will be larger in the case of a large but loosely connected online social network than in the case of a small but densely connected one. Combining these observations, we conclude that information with low transmissibility spreads more effectively with a small but densely connected social network, whereas highly transmissible information will reach more people with the help of a large but loosely connected social network; here the basis of comparison is again the total number of edges in the overlay network.

It is important to remark that the differences observed 
between the three lines for the multiplex case may not be 
solely attributed to the changes in the assortativity levels. 
This is because when we adjust a, the degree distribution of 
the network changes as well. For example, it is easy to see 
that as a increases the variance of the degree distribution 
tends to be lower, which is known [ [33| to increase the 
epidemic threshold; i.e., it has a similar impact on the 
epidemic threshold with reducing the assortativity. To better 
understand the impact of a on the degree distribution and 
hence on the information propagation dynamics, we now 
compare the three simplex network cases in Figure i.e., 
we compare the data points shown with markers. This time 
as well, we see that as a gets larger the epidemic threshold 
and the final epidemic size gets larger, although the dif¬ 
ferences observed are less significant as compared to the 
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multiplex case discussed above. Intuitively speaking, this 
would be expected since none of these three cases exhibit 
assortativity and the observed impact is only due to the 
change in the degree distribution. The observed impact of 
a on the degree distribution and on the epidemic threshold 
can in fact be quantified. First, we recall that for a single 
layer network, the critical threshold for epidemics is given 
by @ 


- 1 )] 

E[d*] 


(24) 


where di is the degree of an arbitrary node i. With the choice 
of the degree distribution given in ( |23| and Tf = = T, it 

is easy to see that this condition reduces to 


C^Ablue T Ared T 


O^Ablue 4“ Ared 



or, equivalently, to 

$$T > \frac{2}{3 + 1/\alpha}$$


with our choice of $\alpha\lambda_{\text{blue}} = \lambda_{\text{red}} = 1$. This finding quantifies 
how the critical threshold increases with $\alpha$, and it 
is in perfect agreement with the curves for the simplex case 
shown in the figure, as expected. 
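
The reduction above can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes (since the degree distribution (23) is not reproduced here) that a fraction $\alpha$ of the nodes has Poisson degree with mean $\lambda_{\text{red}} + \lambda_{\text{blue}}$ and the remaining nodes have Poisson degree with mean $\lambda_{\text{red}}$. The function names `poisson_pmf`, `degree_moments`, and `critical_transmissibility` are hypothetical. Under this assumption, $E[d_i] = 2$ and $E[d_i(d_i - 1)] = 3 + 1/\alpha$, so the threshold should equal $2/(3 + 1/\alpha)$ and the degree variance should equal $1 + 1/\alpha$, which decreases as $\alpha$ grows.

```python
import math


def poisson_pmf(k, lam):
    """P[X = k] for X ~ Poisson(lam), computed in log space to avoid overflow."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def degree_moments(alpha, lam_blue, lam_red, kmax=400):
    """Return (E[d], E[d(d-1)], Var[d]) for the ASSUMED degree mixture:
    with probability alpha a node has Poisson(lam_red + lam_blue) degree,
    otherwise Poisson(lam_red). The pmf is summed numerically up to kmax."""
    mean = fact2 = second = 0.0
    for k in range(kmax + 1):
        p = (alpha * poisson_pmf(k, lam_red + lam_blue)
             + (1.0 - alpha) * poisson_pmf(k, lam_red))
        mean += k * p
        fact2 += k * (k - 1) * p
        second += k * k * p
    return mean, fact2, second - mean ** 2


def critical_transmissibility(alpha):
    """Boundary value of T in the condition T * E[d(d-1)] / E[d] > 1,
    with alpha * lam_blue = lam_red = 1 as chosen in the text."""
    mean, fact2, _ = degree_moments(alpha, 1.0 / alpha, 1.0)
    return mean / fact2


if __name__ == "__main__":
    for alpha in (0.1, 0.5, 1.0):
        _, _, var = degree_moments(alpha, 1.0 / alpha, 1.0)
        # Columns: alpha, numerical threshold, closed form, degree variance
        print(alpha, critical_transmissibility(alpha), 2 / (3 + 1 / alpha), var)
```

Running the script prints, for each $\alpha$, the numerically computed threshold next to the closed form, together with the degree variance, so the monotone trends in $\alpha$ can be read off directly.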


6 Conclusion 

We analyze the propagation of information in clustered 
multilayer networks, where the vertex set of one network is 
a subset of the vertex set of the other. We solve analytically 
for the threshold, probability, and mean size of information 
epidemics, and confirm our findings via extensive computer 
simulations. We show from various angles that clustering 
increases the epidemic threshold and decreases the final 
epidemic size in multi-layer networks. We also demonstrate 
how the overlap between the constituent networks affects 
the information propagation dynamics, particularly through 
impacting the degree-degree correlations. For instance, we 
show that an online social network F that is small in size but 
large in mean connectivity is more effective in facilitating 
the propagation of information as compared to a large 
social network with smaller mean connectivity, with the 
total number of edges fixed in both cases. 

Our general framework contains non-clustered multi-layer 
networks and single-layer clustered networks as special 
cases. In addition, given that the information propagation 
problem is studied via bond percolation over a multi-layer 
network, our work can also be useful in the context of 
robustness against random attacks. Assume that our system 
consists of two conjoint networks F and W, and that an adversary 
attacks edges in both networks randomly with probabilities 
$T_f$ and $T_w$, respectively, with the aim of disconnecting the 
whole system. Then, the size and existence of the giant 
component after edge failures would be natural metrics for 
the robustness of this system against random attacks. To that 
end, we believe our results (e.g., Figure ) would be useful 
in understanding the impact of clustering on the robustness 
of multi-layer networks. 
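
The robustness reading can be illustrated with a small simulation. This is a toy sketch under stated assumptions, not the model analyzed in the paper. Both layers are plain Erdős–Rényi-style random graphs (no clustering), the vertex set of F is the first $\alpha n$ nodes of W, and each edge is kept independently with probability $T_f$ (in F) or $T_w$ (in W), which is the bond-percolation interpretation of the transmissibilities. All function names and the default mean degrees are hypothetical.

```python
import random


def largest_component_fraction(n, edges):
    """Fraction of the n nodes in the largest connected component (union-find)."""
    parent = list(range(n))
    size = [1] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru
            size[ru] += size[rv]
    return max(size[find(x)] for x in range(n)) / n


def random_edges(nodes, mean_degree, rng):
    """Connect each pair of `nodes` independently so the expected degree is mean_degree."""
    m = len(nodes)
    if m < 2:
        return []
    p = mean_degree / (m - 1)
    return [(nodes[i], nodes[j])
            for i in range(m) for j in range(i + 1, m)
            if rng.random() < p]


def surviving_giant_fraction(n, alpha, T_f, T_w, rng, mean_w=2.0, mean_f=2.0):
    """Largest-component fraction of the conjoint system after random edge failures.
    ASSUMPTION: an edge survives with probability T_w (layer W) or T_f (layer F)."""
    w_nodes = list(range(n))
    f_nodes = w_nodes[: int(alpha * n)]  # F's vertex set is a subset of W's
    kept = [e for e in random_edges(w_nodes, mean_w, rng) if rng.random() < T_w]
    kept += [e for e in random_edges(f_nodes, mean_f, rng) if rng.random() < T_f]
    return largest_component_fraction(n, kept)


if __name__ == "__main__":
    rng = random.Random(0)
    for T in (0.2, 0.4, 0.6, 0.8, 1.0):
        print(T, surviving_giant_fraction(1000, 0.5, T, T, rng))
```

Sweeping `T_f` and `T_w` in this way gives a rough empirical picture of where the giant component appears. Replacing `random_edges` with a generator for clustered networks would be needed to study the effect of clustering itself.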

There are many open problems one might consider for 
future work. For instance, the impact of assortativity on the 
propagation of information over multi-layer networks is not 
fully understood. Another interesting direction would 
be to consider networks that exhibit clustering not only 
through triangles, but also through larger cliques. Extending 
some of the ideas presented here to the case of influence 
propagation (e.g., complex contagions) would also be interesting. 